Predoop: Preempting Reduce Task for Job Execution Accelerations

Authors

  • Yi Liang
  • Yufeng Wang
  • Minglu Fan
  • Chen Zhang
  • Yuqing Zhu
Abstract

Map/Reduce is a popular parallel processing framework for data-intensive computing. To overlap the Map tasks' execution phase with the Reduce task's intermediate data fetching and merging phase, existing Map/Reduce schedulers always pre-launch the Reduce task once a specified threshold of the job's Map tasks have been launched. Under this pattern, the Reduce task keeps occupying its resources while it sits idle, waiting to fetch intermediate data from Map tasks. To address this issue, we propose Predoop, an extension of the Hadoop Map/Reduce framework. The basic idea of Predoop is to preempt the Reduce task during its idle time and allocate the released resources to the on-schedule Map tasks. To achieve this goal, first, we introduce preemptive mechanisms for Reduce tasks and Map tasks, respectively, so that Map/Reduce tasks can be preempted and resumed with correct status; second, we adopt a preempting-resuming model for the Reduce task that considers the progress of both the Reduce task's data fetching and merging and the Map tasks' execution to determine when to preempt and resume the Reduce task; third, we introduce a preemption-aware task scheduling strategy that allocates the released resources to the on-schedule Map tasks with consideration of data locality. Experimental results demonstrate that Predoop outperforms Hadoop on various workloads and that the average job turnaround time can be reduced by up to 66.57%.
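The preempting-resuming idea described in the abstract can be illustrated with a small sketch. This is not the Predoop implementation: the `ReduceState` fields, the idle test, and the `backlog_threshold` parameter are all hypothetical choices made for illustration, since the abstract does not specify the actual model. The sketch assumes a Reduce task is idle when it has already fetched every Map output that is available, and that it should resume once enough unfetched Map outputs have accumulated or all Map tasks have finished.

```python
from dataclasses import dataclass


@dataclass
class ReduceState:
    """Hypothetical snapshot of one Reduce task's fetch progress."""
    total_maps: int       # number of Map tasks in the job
    completed_maps: int   # Map tasks whose output is available to fetch
    fetched_maps: int     # Map outputs already fetched and merged
    preempted: bool = False


def should_preempt(s: ReduceState) -> bool:
    # Assumed rule: a running Reduce task is idle when nothing is left to
    # fetch right now, yet some Map tasks have not finished.
    idle = s.fetched_maps >= s.completed_maps
    maps_remaining = s.completed_maps < s.total_maps
    return (not s.preempted) and idle and maps_remaining


def should_resume(s: ReduceState, backlog_threshold: int = 4) -> bool:
    # Assumed rule: resume when the backlog of unfetched Map outputs reaches
    # a threshold, or when every Map task has completed.
    backlog = s.completed_maps - s.fetched_maps
    all_maps_done = s.completed_maps >= s.total_maps
    return s.preempted and (all_maps_done or backlog >= backlog_threshold)


def step(s: ReduceState, backlog_threshold: int = 4) -> str:
    """Apply one scheduling decision and return the action taken."""
    if should_preempt(s):
        s.preempted = True
        return "preempt"
    if should_resume(s, backlog_threshold):
        s.preempted = False
        return "resume"
    return "keep"
```

In this sketch, a scheduler would call `step` whenever Map progress changes, and the slot freed by a `"preempt"` action would be handed to a waiting Map task. The locality-aware choice of that Map task is outside the scope of the example.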

Similar resources

A Cross-Jobs-Cross-Phases Map-Reduce Scheduling Algorithm in Heterogeneous Cloud

To process large-scale data quickly, a map-reduce cloud is viewed as a very reasonable and effective platform. To address the new scheduling challenges in map-reduce clouds, a cross-jobs-cross-phases (CJCP) map-reduce scheduling algorithm is proposed in this paper. CJCP mainly consists of four optimal schemes, which respectively deal with four resource-waste scenarios in the job scheduling process....

Resource Provisioning based on Preempting Virtual Machines in Resource Sharing Environments

Resource provisioning is one of the main challenges in large-scale resource sharing environments such as federated Grids. Recently, many resource management systems in these environments have started to use the lease abstraction and virtual machines (VMs) for resource provisioning. In resource sharing environments resource providers serve requests from external users along with their own local ...

Resource provisioning based on preempting virtual machines in distributed systems

Resource provisioning is one of the main challenges in large-scale distributed systems such as federated Grids. Recently, many resource management systems in these environments have started to use the lease abstraction and virtual machines (VMs) for resource provisioning. In the large-scale distributed systems, resource providers serve requests from external users along with their own local use...

Solving Task Scheduling Problem in Cloud Computing Environment Using Orthogonal Taguchi-Cat Algorithm

Received Jan 9, 2017; revised Mar 15, 2017; accepted Apr 8, 2017. In cloud computing datacenters, task execution delay is no longer accidental. In recent times, a number of artificial intelligence scheduling techniques have been proposed and applied to reduce task execution delay. In this study, we propose an algorithm called Orthogonal Taguchi Based-Cat Swarm Optimization (OTB-CSO) to minimize total ta...

Reducing Execution Waste in Priority Scheduling: a Hybrid Approach

Guaranteeing quality for differentiated services while ensuring resource efficiency is an important and yet challenging problem in large computing clusters. Priority scheduling is commonly adopted in production systems to minimize the response time of high-priority workload by means of preempting the execution of low-priority workload when faced with limited resources. As a result, the system p...

Publication date: 2014